PREFACE


This technical report is an invited chapter for the Handbook of Statistics:
Nonparametric Methods, Volume 4 in a series edited by P. R. Krishnaiah and
P. K. Sen and to be published by North-Holland Publishing Company, Amsterdam.
Much of the material by the author used in the chapter was developed under
ONR-sponsored research at the Florida State University and earlier at the
Virginia Polytechnic Institute and State University. Some minor new
generalizations of earlier work are included here.
Paired Comparisons

by 


Ralph  A.  Bradley* 
Department  of  Statistics 
Florida  State  University 
Tallahassee, FL 32306


1. Introduction

Interest  in  paired  comparisons  in  statistics  and  psychometrics  has 
developed  in  the  contexts  of  the  design  of  experiments,  nonparametric 
statistics,  and  scaling,  including  multidimensional  scaling.  Applications 
have  arisen  in  many  areas,  but  most  notably  in  food  technology,  marketing 
research,  and  sports  competition.  An  extensive  bibliography  on  paired 
comparisons  by  Davidson  and  Farquhar  (1976)  contains  some  400  references. 

Paired  comparisons  have  been  considered  in  design  of  experiments  as 
incomplete  block  designs  with  block  size  two  by  Clatworthy  (1955)  and  others. 
Scheffé (1952) developed an analysis of variance for paired comparisons
with consideration for possible order effects for the two treatments
within blocks. When the usual parametric models of analysis of variance
are imposed, the analysis of such designs follows standard methods and
will not be discussed here.

The  emphasis  in  this  chapter  will  be  on  paired  comparisons  as  a 
means  of  designing  comparative  experiments  when  no  natural  measuring 
scale  is  available.  The  author's  interest  in  paired  comparisons  arose 
in  consideration  of  statistical  methods  in  sensory  difference  testing. 


*The work of the author is supported in part by the Office of Naval
Research  under  Contract  N00014-80-C-0093.  Reproduction  in  whole  or  in 
part  is  permitted  for  any  purpose  of  the  United  States  Government. 
When responses of individuals to items under comparison are subjective,
and particularly when sensory responses to taste, odor, color or sound
are involved, evaluation is easier when the number of items or samples
to be considered at one time is small and the effects of sensory fatigue
are minimized. Probabilistic models for paired comparisons may be devised
to represent the experimental situation and permit appropriate data analysis.
The models provide probabilities of possible choices of items or treatments
from pairs of items and hence depend on orderings. The statistical methods
devised are thus ranking methods and, while they are not literally
nonparametric methods, they are often so classified.

The basic paired comparisons experiment has $t$ treatments, $T_1, \ldots, T_t$,
and $n_{ij} \ge 0$ comparisons of $T_i$ with $T_j$, $n_{ji} = n_{ij}$, $i \ne j$,
$i, j = 1, \ldots, t$. For each comparison, preference or order is designated
by $a_{ij\alpha}$: $a_{ij\alpha} = 1$ if $T_i$ is "preferred" to $T_j$ in the
$\alpha$th comparison of $T_i$ and $T_j$, $a_{ij\alpha} = 0$ otherwise, and
$a_{ij\alpha} + a_{ji\alpha} = 1$. In further definition of notation, let

$$ a_{ij} = \sum_{\alpha=1}^{n_{ij}} a_{ij\alpha} \quad \text{and} \quad a_i = \sum_{j \ne i} a_{ij}, $$

the latter being the total number of preferences for $T_i$.

In  sensory  evaluations,  responses  may  be  preferences  or  attribute  order 
judgments  on  such  characteristics  as  sweetness,  smoothness,  whiteness,  etc. 
We  shall  loosely  refer  to  preference  judgments. 

Dykstra (1960) provides typical data on a paired comparisons preference
taste test involving four variations of the same product. The data are
summarized in Table 1. Note that the experiment is not balanced: $n_{12} = 140$,
$n_{13} = 54$, $n_{14} = 57$, $n_{23} = 63$, $n_{24} = 58$, $n_{34} = 0$; treatments
$T_3$ and $T_4$ were not compared. Unbalanced experiments are permissible as
long as the design is connected: it is not possible to select a subset of
the treatments such that no treatment in the subset is compared directly
with a treatment in the complementary subset. Balanced experiments are more
efficient when there is equal interest in all treatments and treatment
comparisons.


Table 1

Summary of Results of a Taste Test


          T_1     T_2     T_3     T_4     a_i

T_1        --      28      15      23      66
T_2       112      --      46      47     205
T_3        39      17      --      --      56
T_4        34      11      --      --      45


We shall return to analysis of the data of Table 1, which gives values
of $a_{ij}$, after discussion of models for paired comparisons and
establishment of basic procedures.

This chapter is organized in such a way as to give initial attention
to the analysis of basic paired comparisons data like those of Table 1.
Then extensions of the method are developed for factorial treatment
combinations and for multivariate responses, that is, responses on several
attributes for each paired comparison. The emphasis is on the methodology and
applications, although properties of procedures are noted and references
given. We conclude with comments on additional methods of analysis.
2. Models for Paired Comparisons

When $t = 2$, a paired comparisons experiment with treatments $T_1$ and
$T_2$ might be modelled as $n_{12} > 0$ independent Bernoulli trials with
probabilities of choices for $T_1$ and $T_2$ being $\pi_1$ and $\pi_2$,
$\pi_i \ge 0$, $i = 1, 2$, $\pi_1 + \pi_2 = 1$. Then in some sense $\pi_1$ and
$\pi_2$ are measures of "worth" of $T_1$ and $T_2$. Binomial theory applies and
the sign test may be used to test the hypothesis, $H_0$: $\pi_1 = \pi_2$.

Bradley and Terry (1952a) proposed a basic model for paired comparisons,
extended by Dykstra (1960) to include unequal values of the $n_{ij}$.
The approach was a heuristic extension of the special binomial when
$t = 2$. Treatment parameters, $\pi_1, \ldots, \pi_t$, $\pi_i \ge 0$,
$i = 1, \ldots, t$, are associated with the $t$ treatments, $T_1, \ldots, T_t$.
It was postulated that these parameters represent relative selection
probabilities for the treatments so that the probability of selection of
$T_i$ when compared with $T_j$ is

$$ P(T_i \to T_j) = \pi_i/(\pi_i + \pi_j), \quad i \ne j, \; i, j = 1, \ldots, t. \tag{2.1} $$

Since the right-hand member of (2.1) is invariant under change of scale,
specificity was obtained by the requirement that

$$ \sum_{i=1}^{t} \pi_i = 1. \tag{2.2} $$

The model proposed imposes structure in that the most general model might
postulate binomial parameters $\pi_{ij}$ and $\pi_{ji} = 1 - \pi_{ij}$ for
comparisons of $T_i$ and $T_j$ so that the totality of functionally
independent parameters is $\binom{t}{2}$ rather than $(t-1)$ as specified in
(2.1) and (2.2).
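A small numerical illustration of (2.1) and (2.2) may help fix ideas. The sketch below is in Python, which is not part of the original report; the function name `selection_probability` and the three worth values are hypothetical choices made only for illustration. It shows that the two selection probabilities for a pair sum to one and that rescaling all parameters leaves (2.1) unchanged, which is why the constraint (2.2) is needed.

```python
def selection_probability(pi_i, pi_j):
    """P(T_i -> T_j) under model (2.1): pi_i / (pi_i + pi_j)."""
    return pi_i / (pi_i + pi_j)


# Hypothetical worth parameters for t = 3 treatments, normalized as in (2.2).
worths = [0.5, 0.3, 0.2]

p12 = selection_probability(worths[0], worths[1])
p21 = selection_probability(worths[1], worths[0])

# Multiplying every parameter by the same constant does not change (2.1).
scaled = [10.0 * w for w in worths]
p12_scaled = selection_probability(scaled[0], scaled[1])
```

With these illustrative values, $T_1$ is selected over $T_2$ with probability .5/(.5 + .3) = .625.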
The basic model (2.1) for paired comparisons has been discovered
and rediscovered by various authors. Zermelo (1929) seems to have
proposed it first in consideration of chess competition. Ford (1957)
proposed the model independently. Both Zermelo and Ford concentrated on
solution of normal equations for parameter estimation and Ford proved
convergence of the iterative procedure for solution.

The  model  arises  as  one  of  the  special  simple  realizations  of  more 
general  models  developed  from  distributional  or  psychophysical  approaches. 
Bradley  (1976)  has  reviewed  various  model  formulations  and  discussed  them 
under  categories  —  linear  models,  the  Lehmann  model,  psychophysical 
models,  and  models  of  choice  and  worth. 

David (1963, Section 1.3) supposes that $T_i$ has "merit" $V_i$,
$i = 1, \ldots, t$, when judged on some characteristic, and that these merits
may be represented on a merit scale. He defined "linear" models to be such that

$$ P(T_i \to T_j) = H(V_i - V_j), \tag{2.3} $$

where $H$ is a distribution function for a symmetric distribution,
$H(-x) = 1 - H(x)$. Model (2.1) is a linear model since it may be written
in the form,

$$ P(T_i \to T_j) = \frac{1}{4} \int_{-(\log \pi_i - \log \pi_j)}^{\infty} \mathrm{sech}^2 (y/2) \, dy = \frac{\pi_i}{\pi_i + \pi_j}, \tag{2.4} $$

as described by Bradley (1953) using the logistic density function.
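The equivalence of the integral form (2.4) and the ratio form (2.1) can be checked numerically. The following Python sketch is an illustration only: the worth values .3 and .1, the midpoint-rule integrator, and the truncation of the upper limit at 40 are all assumptions made here, not part of the original development.

```python
import math


def logistic_density(y):
    """The logistic density (1/4) sech^2(y/2) appearing in (2.4)."""
    return 0.25 / math.cosh(y / 2.0) ** 2


def upper_tail_integral(lower, upper=40.0, steps=200000):
    """Midpoint-rule integral of the logistic density over [lower, upper].

    The upper limit 40 stands in for infinity; the neglected tail is of
    order exp(-40).
    """
    width = (upper - lower) / steps
    return width * sum(
        logistic_density(lower + (k + 0.5) * width) for k in range(steps)
    )


# Hypothetical worth parameters for a single pair of treatments.
pi_i, pi_j = 0.3, 0.1

lower_limit = -(math.log(pi_i) - math.log(pi_j))
integral_form = upper_tail_integral(lower_limit)  # integral in (2.4)
ratio_form = pi_i / (pi_i + pi_j)                 # right-hand side of (2.1)
```

Both forms give .75 for these values, the integral to within the accuracy of the quadrature.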

Thurstone (1927) proposed a model for paired comparisons, that is
also a linear model, through the concept of a subjective continuum, an
inherent sensation scale on which order, but not physical measurement,
could be discerned. Mosteller (1951) provides a detailed formulation
and an analysis of Thurstone's important Case V. With suitable scaling,
each treatment has a location point on the continuum, say $\mu_i$ for $T_i$,
$i = 1, \ldots, t$. An individual is assumed to receive a sensation $X_i$ in
response to $T_i$, with responses normally distributed about $\mu_i$. When
an individual compares $T_i$ and $T_j$, he in effect is assumed to report the
order of sensations $X_i$ and $X_j$ which may be correlated; $X_i > X_j$ may be
associated with $T_i \to T_j$. Case V takes all such correlations equal and
the variances of all $X_i$ equal. The probability of selection may be
written

$$ P(T_i \to T_j) = P(X_i > X_j) = \frac{1}{\sqrt{2\pi}} \int_{-(\mu_i - \mu_j)}^{\infty} e^{-y^2/2} \, dy. \tag{2.5} $$

It is apparent from (2.4) and (2.5) that the two models are very similar.
The choice between the models is much like the choice between logits and
probits in biological assay. The use of $\log \pi_i$ as a measure of location
for $T_i$ in the first model is suggested.

Models (2.4) and (2.5) give very similar results in applications.
Comparisons are made by Fleckenstein, Freund and Jackson (1958) with
test data on comparisons of typewriter carbon papers. In general, more
extensions of model (2.4) exist and we shall use that model in this chapter.

3.  Basic  Procedures 

The general approach to analysis of paired comparisons based on the
model (2.1) is through likelihood methods. On the assumption of independent
responses for the $n_{ij}$ comparisons of $T_i$ and $T_j$, the binomial component
of the likelihood function for this pair of treatments is

$$ \pi_i^{a_{ij}} \pi_j^{a_{ji}} / (\pi_i + \pi_j)^{n_{ij}}, $$

ties or no preference judgments not being permitted. The complete
likelihood function, on the assumption of independence of judgments between
pairs of treatments, is

$$ L = \prod_{i} \pi_i^{a_i} \Big/ \prod_{i<j} (\pi_i + \pi_j)^{n_{ij}}. \tag{3.1} $$

It is seen that $a_1, \ldots, a_t$ constitute a set of sufficient statistics
for the estimation of $\pi_1, \ldots, \pi_t$ and that $a_i$ is the total number of
preferences or selections of $T_i$, $i = 1, \ldots, t$, for the entire experiment.

3.1.  Likelihood  Estimation 

ML estimators, $p_i$ for $\pi_i$, $i = 1, \ldots, t$, are obtained through
maximization of $\log L$ in (3.1) subject to the constraint (2.2). After minor
simplifications, the resulting likelihood equations are

$$ \frac{a_i}{p_i} - \sum_{j \ne i} \frac{n_{ij}}{p_i + p_j} = 0, \quad i = 1, \ldots, t, \tag{3.2} $$

and

$$ \sum_{i} p_i = 1. \tag{3.3} $$

Solution of equations (3.2) and (3.3) is done iteratively. If $p_i^{(k)}$
is the $k$th approximation to $p_i$,

$$ p_i^{(k)} = p_i^{*(k)} \Big/ \sum_{j} p_j^{*(k)}, \quad \text{where} \quad p_i^{*(k)} = a_i \Big/ \sum_{j \ne i} \left[ n_{ij} / (p_i^{(k-1)} + p_j^{(k-1)}) \right], \quad k = 1, 2, \ldots $$

The iteration is started with initial specification of the $p_i^{(0)}$; one may
take $p_i^{(0)} = 1/t$, $i = 1, \ldots, t$, and this is adequate although Dykstra
(1956, 1960) has suggested better initial values.

We return to the example of Table 1. Values of $a_i$ are given in
the table and values of $n_{ij}$ precede the table. Solution of equations
(3.2) and (3.3) was begun with $p_i^{(0)} = 1/4$, $i = 1, \ldots, 4$. Results for
initial iterations are summarized in Table 2 along with final values
for $p_i$; typically approximately 10 iterations are sufficient for
four-decimal accuracy in the final values. It is this iterative procedure
that Ford (1957) has shown to converge. The procedure is easy to program
on computers because of the symmetry of the equations to be solved.
Bradley and Terry (1952a) and Bradley (1954a) have provided tables giving
values of the $p_i$ for equal values of the $n_{ij} = n$: $t = 3$, $n = 1, \ldots, 10$;
$t = 4$, $n = 1, \ldots, 8$; $t = 5$, $n = 1, \ldots, 5$.
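The iterative solution of (3.2) and (3.3) may be sketched in a few lines of Python. This is an illustration rather than the program used for the report: the function name `bradley_terry_iterations` is hypothetical, and the update assumed here revises all $p_i$ simultaneously from the previous iterate and then rescales them to sum to one, a scheme that reproduces the early columns of Table 2 for the taste test data.

```python
def bradley_terry_iterations(a, n, iterations=200):
    """Return successive approximations to the solution of (3.2)-(3.3).

    a[i] is the total number of preferences for treatment i and n[i][j]
    is the number of comparisons of treatments i and j (0 if not compared).
    The starting point is p_i = 1/t.
    """
    t = len(a)
    p = [1.0 / t] * t
    history = [p]
    for _ in range(iterations):
        # Unnormalized update: a_i divided by the sum over j of n_ij/(p_i+p_j).
        raw = [
            a[i] / sum(n[i][j] / (p[i] + p[j]) for j in range(t) if j != i)
            for i in range(t)
        ]
        total = sum(raw)
        p = [value / total for value in raw]  # rescale to satisfy (3.3)
        history.append(p)
    return history


# Taste test data of Table 1: preference totals a_i and comparison counts n_ij.
a = [66, 205, 56, 45]
n = [
    [0, 140, 54, 57],
    [140, 0, 63, 58],
    [54, 63, 0, 0],
    [57, 58, 0, 0],
]

history = bradley_terry_iterations(a, n)
first_iterate = history[1]
p_hat = history[-1]
```

The first iterate agrees with the second column of Table 2, and the limiting values agree with the final column to the accuracy printed there.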

Table 2

Values of the Estimators in the Iterative Solution


        p_i^(0)   p_i^(1)   p_i^(2)   p_i^(3)   p_i^(4)   p_i^(5)    p_i

T_1       .25      .1371     .1188     .1137     .1112     .1101     .1082
T_2       .25      .4094     .4656     .4918     .5049     .5131     .5193
T_3       .25      .2495     .2413     .2357     .2327     .2290     .2294
T_4       .25      .2040     .1743     .1588     .1512     .1478     .1431


In small experiments with small values of the $n_{ij}$, perhaps with poorly
selected treatments, the estimates $p_i$ may define a point on a boundary
of the parameter space. These situations may be recognized from tables
like Table 1 and require special consideration. As an example, refer
to Table 1 and suppose that $T_2$ and $T_3$ are always preferred to $T_1$ and $T_4$
and Table 1 is unchanged otherwise. Then $a_1 = 23$, $a_2 = 244$, $a_3 = 71$ and
$a_4 = 34$. Treatments $T_2$ and $T_3$ dominate $T_1$ and $T_4$ and information on the
relative values of $\pi_2$ and $\pi_3$ comes only from the direct comparisons of
$T_2$ and $T_3$. It follows that $p_1 = 0$, $p_2 = 46/63 = .7302$, $p_3 = 17/63 = .2698$,
and $p_4 = 0$. But there is also information on the relative values of $\pi_1$
and $\pi_4$. We find $p_1/p_4 = 23/34 = .4035/.5965$ and can write $p_1 = .4035\delta$
and $p_4 = .5965\delta$, $\delta$ infinitesimal. A formal analysis may be conducted
through maximization of $\log L$ with respect to $\pi_1^*$, $\pi_2$, $\pi_3$, $\pi_4^*$,
$\pi_2 + \pi_3 = \pi_1^* + \pi_4^* = 1$, where $\pi_1 = \delta\pi_1^*$,
$\pi_4 = \delta\pi_4^*$ and $\delta$ is small. Indeed, the maximum
value of $\log L$ may be found in this way and it is needed in the
computation of likelihood ratios as discussed below. Bradley (1954a) provides
additional discussion of these special boundary problems, problems not
usually encountered in applications.

3.2.  Tests  of  Hypotheses 

(i) The major test proposed by Bradley and Terry (1952a) was that
of treatment preference or selection equality. The null hypothesis is

$$ H_0: \pi_1 = \pi_2 = \cdots = \pi_t = 1/t \tag{3.4} $$

and the general alternative hypothesis is

$$ H_a: \pi_i \ne \pi_j \text{ for some } i, j, \; i \ne j, \; i, j = 1, \ldots, t. \tag{3.5} $$

If we designate the likelihood ratio as $\lambda_1$, it is easy to show that

$$ -2 \log \lambda_1 = 2N \log 2 - 2B_1, \quad N = \sum_{i<j} n_{ij}, \quad B_1 = \sum_{i<j} n_{ij} \log(p_i + p_j) - \sum_{i} a_i \log p_i. \tag{3.6} $$

For large $n_{ij}$, $-2 \log \lambda_1$ has the central chi-square distribution with
$(t-1)$ degrees of freedom under $H_0$. Values of $B_1$, together with exact
significance levels, were provided with the cited tables* of estimators
$p_i$. Comparison of significance levels for the large-sample test with
small-sample exact significance levels in the tables suggests that the
former may be used for modest values of the $n_{ij}$, a situation perhaps
comparable to use of the normal approximation to the binomial.

For the values of the $a_i$ of Table 1, the noted values of the $n_{ij}$
above that table, $N = 372$, and the values of the $p_i$ in Table 2, we have
$B_1 = 206.3214$ and $-2 \log \lambda_1 = 103.06$ with 3 degrees of freedom. There
is a clear indication that the $\pi_i$ are not equal and that treatment
preferences differ.
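The computation of $B_1$ and of the statistic in (3.6) for the taste test data may be sketched as follows. The Python code is illustrative only; the helper names `fit` and `b_one` are hypothetical, and the estimates are recomputed by the normalized iteration rather than read from Table 2, so small differences in the last printed digit are to be expected.

```python
import math


def fit(a, n, iterations=500):
    """Solve (3.2)-(3.3) by the normalized iteration started at p_i = 1/t."""
    t = len(a)
    p = [1.0 / t] * t
    for _ in range(iterations):
        raw = [
            a[i] / sum(n[i][j] / (p[i] + p[j]) for j in range(t) if j != i)
            for i in range(t)
        ]
        total = sum(raw)
        p = [value / total for value in raw]
    return p


def b_one(a, n, p):
    """B_1 of (3.6): sum_{i<j} n_ij log(p_i + p_j) - sum_i a_i log p_i."""
    t = len(a)
    pair_sum = sum(
        n[i][j] * math.log(p[i] + p[j])
        for i in range(t)
        for j in range(i + 1, t)
    )
    return pair_sum - sum(a[i] * math.log(p[i]) for i in range(t))


# Taste test data of Table 1.
a = [66, 205, 56, 45]
n = [
    [0, 140, 54, 57],
    [140, 0, 63, 58],
    [54, 63, 0, 0],
    [57, 58, 0, 0],
]

p_hat = fit(a, n)
total_comparisons = sum(n[i][j] for i in range(4) for j in range(i + 1, 4))
b1 = b_one(a, n, p_hat)
statistic = 2 * total_comparisons * math.log(2) - 2 * b1  # -2 log lambda_1
degrees_of_freedom = len(a) - 1
```

The statistic would then be referred to the chi-square distribution with $t - 1 = 3$ degrees of freedom.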

(ii) It is always incumbent on statisticians to check the validity
of models used in statistical analyses when possible. We have noted
above that a general "multi-binomial" model with $\binom{t}{2}$ functionally
independent parameters $\pi_{ij}$ may be posed that ignores the structure of
paired comparisons in the sense that the same treatment is compared with
more than one other treatment. The multi-binomial model fits the data of
tables like Table 1 perfectly. This permits a test of the more restrictive
model of (2.1).


*Common logarithms were used to compute $B_1$ in these tables. In
this paper, natural logarithms are used throughout.

The following likelihood ratio test was proposed by Bradley (1954b)
and extended by Dykstra (1960). Consider the null hypothesis,

$$ H_0: \pi_{ij} = \pi_i/(\pi_i + \pi_j), \quad i \ne j, \; i, j = 1, \ldots, t, \tag{3.7} $$

and the alternative hypothesis,

$$ H_a: \pi_{ij} \ne \pi_i/(\pi_i + \pi_j) \text{ for some } i, j, \; i \ne j. \tag{3.8} $$

Under $H_a$, the likelihood estimator of $\pi_{ij}$ is $p_{ij} = a_{ij}/n_{ij}$ when
$n_{ij} > 0$ and the estimator is not needed when $n_{ij} = 0$. Under $H_0$, $p_i$ is
the estimator of $\pi_i$ from equations (3.2) and (3.3). Designating $\lambda_2$ as
the likelihood ratio statistic, we have

$$ -2 \log \lambda_2 = 2 \Big( \sum_{i \ne j} a_{ij} \log a_{ij} - \sum_{i<j} n_{ij} \log n_{ij} + B_1 \Big). \tag{3.9} $$

For large $n_{ij}$, $-2 \log \lambda_2$ is taken to have the chi-square distribution
with $\binom{t}{2} - (t-1) = \tfrac{1}{2}(t-1)(t-2)$ degrees of freedom under $H_0$.
An alternative statistic, asymptotically equivalent to that of (3.9), is

$$ X^2 = \sum_{i \ne j} (a_{ij} - a'_{ij})^2 / a'_{ij}, \tag{3.10} $$

where $a'_{ij} = n_{ij} p_i/(p_i + p_j)$ and $a_{ij} = n_{ij} p_{ij}$. This alternate
form may be rewritten,

$$ X^2 = \sum_{i \ne j} n_{ij} \{ p_{ij} - [p_i/(p_i + p_j)] \}^2 / [p_i/(p_i + p_j)]. \tag{3.11} $$


(3.11) 
Dykstra has noted that the test statistics may be distorted when some
$n_{ij}$ are small. Since there is no basis for pooling terms in this case, he
suggested omitting terms in (3.11) with very small values of $n_{ij}$ (and
hence $n_{ji}$) and deleting one degree of freedom for each pair of terms so
deleted.

For the data of Table 1, $n_{34} = 0$ and the tests for the fit of the
model have $\tfrac{1}{2}(3)(2) - 1 = 2$ degrees of freedom. From (3.9),
$-2 \log \lambda_2 = 2.02$ and there seems to be no reason to doubt the
appropriateness of the model (2.1). The statistic in (3.10) is evaluated also
for illustrative purposes. Values of the $a'_{ij}$ are given in Table 3 and they
may be compared directly with the values of $a_{ij}$ in Table 1. Computation
yields $X^2 = 2.00$; the close agreement of the two computations is typical.
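The two goodness-of-fit statistics (3.9) and (3.10) can be recomputed for the taste test data along the following lines. The Python sketch is an illustration under stated assumptions: the names `fit`, `g_squared`, and `x_squared` are hypothetical, pairs with $n_{ij} = 0$ are simply skipped, and one degree of freedom is removed for the pair not compared, as in the text.

```python
import math

# Preference counts a_ij of Table 1 (row i preferred to column j).
a = [
    [0, 28, 15, 23],
    [112, 0, 46, 47],
    [39, 17, 0, 0],
    [34, 11, 0, 0],
]
t = len(a)
n = [[a[i][j] + a[j][i] for j in range(t)] for i in range(t)]
totals = [sum(row) for row in a]


def fit(totals, n, iterations=500):
    """Solve (3.2)-(3.3) by the normalized iteration started at p_i = 1/t."""
    p = [1.0 / t] * t
    for _ in range(iterations):
        raw = [
            totals[i] / sum(n[i][j] / (p[i] + p[j]) for j in range(t) if j != i)
            for i in range(t)
        ]
        scale = sum(raw)
        p = [value / scale for value in raw]
    return p


p = fit(totals, n)
pairs = [(i, j) for i in range(t) for j in range(i + 1, t)]

# B_1 of (3.6).
b1 = sum(n[i][j] * math.log(p[i] + p[j]) for i, j in pairs) - sum(
    totals[i] * math.log(p[i]) for i in range(t)
)

# Likelihood ratio statistic (3.9); terms with zero counts contribute nothing.
g_squared = 2 * (
    sum(a[i][j] * math.log(a[i][j]) for i in range(t) for j in range(t) if a[i][j] > 0)
    - sum(n[i][j] * math.log(n[i][j]) for i, j in pairs if n[i][j] > 0)
    + b1
)

# Estimated frequencies a'_ij and the statistic (3.10).
expected = [
    [n[i][j] * p[i] / (p[i] + p[j]) if i != j else 0.0 for j in range(t)]
    for i in range(t)
]
x_squared = sum(
    (a[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(t)
    for j in range(t)
    if i != j and n[i][j] > 0
)

compared_pairs = sum(1 for i, j in pairs if n[i][j] > 0)
degrees_of_freedom = compared_pairs - (t - 1)
```

The list `expected` holds the estimated frequencies $a'_{ij}$ of the kind tabulated in Table 3.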


Table 3

Estimated Frequencies for the Data of Table 1


          T_1      T_2      T_3      T_4     Row Sums

T_1        --     24.14    17.31    24.54     65.99
T_2     115.86      --     43.70    45.47    205.03
T_3      36.69    19.30      --       --      55.99
T_4      32.46    12.53      --       --      44.99


In the author's fairly extensive experience in fitting model (2.1)
to data in food technology and consumer testing, the model is usually
found to fit well. When the model does not fit, one or more treatments
are often found to possess a characteristic not found in the others,
possibly leading to preference judgments influenced by this attribute
when such treatments are in a comparison.

(iii) In some uses of paired comparisons, responses may be obtained
for several demographic groups, under different evaluation conditions,
or under other criteria for grouping responses. The possibility of group
by treatment interaction or preference disagreement arises and this may
be tested.

Let $u = 1, \ldots, g$ index groups of responses in paired comparisons,
let $\pi_i^u$ be the treatment parameter for $T_i$ in group $u$, and suppose that
sufficient comparisons are made within each group to obtain $p_i^u$, the
estimator of $\pi_i^u$, $i = 1, \ldots, t$. Interest is in the hypotheses,

$$ H_0: \pi_i^u = \pi_i, \quad i = 1, \ldots, t; \; u = 1, \ldots, g, \tag{3.12} $$

and

$$ H_a: \pi_i^u \ne \pi_i \text{ for some } i \text{ and } u. $$

The likelihood ratio test depends on

$$ -2 \log \lambda_3 = 2 \Big( B_1 - \sum_{u} B_{1u} \Big), \tag{3.13} $$

where $B_{1u}$ is computed from (3.6) for the data within group $u$ and $B_1$ is
computed similarly for the pooled data from all of the groups. For large
values of the $n_{iju}$, the number of comparisons of $T_i$ and $T_j$ in group $u$,
$-2 \log \lambda_3$ has the central chi-square distribution with $(g-1)(t-1)$
degrees of freedom under $H_0$ of (3.12).
An omnibus test of treatment equality may be described:

$$ H_0: \pi_i^u = 1/t, \quad i = 1, \ldots, t; \; u = 1, \ldots, g, $$

$$ H_a: \pi_i^u \ne 1/t \text{ for some } i \text{ and } u, $$

$$ -2 \log \lambda_4 = 2N \log 2 - 2 \sum_{u=1}^{g} B_{1u}, \quad N = \sum_{u=1}^{g} N_u = \sum_{u} \sum_{i<j} n_{iju}. $$

The test statistic is taken to have the chi-square distribution with
$g(t-1)$ degrees of freedom under $H_0$. An analysis of chi-square table
may be formed: $-2 \log \lambda_4 = -2 \log \lambda_1 - 2 \log \lambda_3$, where
$-2 \log \lambda_1$ is the test statistic of (3.6) based on the pooled data.

Bradley and Terry (1952a) gave a small example for two tasters
evaluating pork roasts from hogs with differing diets, $t = 3$, $g = 2$,
$n_{iju} = 5$ for all $i$, $j$, $u$, $i \ne j$. The data are summarized in Table 4
and Table 5 is the analysis of chi-square table. The large total
treatment effect is seen to be due to disagreement of the two judges
on preferences.


Table 4

Roast Pork Preference Data for Two Judges


               Judge 1              Judge 2             Pooled Data
Diet i    a_i^(1)   p_i^(1)    a_i^(2)   p_i^(2)       a_i       p_i

  1          1       .0526        7       .5324          8       .2479
  2          7       .4737        5       .2993         12       .4268
  3          7       .4737        3       .1683         10       .3253

           B_11 = 6.7166        B_12 = 9.2895          B_1 = 20.2565


Table 5

Analysis of Chi Square, Roast Pork Data


Test                               Statistic          d.f.    Chi-square

Treatments, given agreement        -2 log lambda_1      2        1.07
Judge by Treatment Interaction     -2 log lambda_3      2        8.50
Treatments                         -2 log lambda_4      4        9.58

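The analysis of chi-square for the roast pork example may be reproduced approximately with the sketch below. It is written in Python for illustration only; the helper names are hypothetical, and the preference totals for Judge 1 are taken here to be the pooled totals less those of Judge 2, an assumption consistent with $n_{iju} = 5$ and with the estimates .0526, .4737, .4737 reported for that judge.

```python
import math


def fit(a, n, iterations=2000):
    """Solve (3.2)-(3.3) by the normalized iteration started at p_i = 1/t."""
    t = len(a)
    p = [1.0 / t] * t
    for _ in range(iterations):
        raw = [
            a[i] / sum(n[i][j] / (p[i] + p[j]) for j in range(t) if j != i)
            for i in range(t)
        ]
        total = sum(raw)
        p = [value / total for value in raw]
    return p


def b_one(a, n):
    """B_1 of (3.6), evaluated at the fitted estimates for totals a, counts n."""
    t = len(a)
    p = fit(a, n)
    pair_sum = sum(
        n[i][j] * math.log(p[i] + p[j])
        for i in range(t)
        for j in range(i + 1, t)
    )
    return pair_sum - sum(a[i] * math.log(p[i]) for i in range(t))


def balanced_design(t, per_pair):
    """Matrix of comparison counts with the same count for every pair."""
    return [[0 if i == j else per_pair for j in range(t)] for i in range(t)]


# Preference totals by diet; Judge 1 inferred as pooled minus Judge 2.
judge_totals = [[1, 7, 7], [7, 5, 3]]
pooled_totals = [x + y for x, y in zip(*judge_totals)]

b_groups = [b_one(a, balanced_design(3, 5)) for a in judge_totals]
b_pooled = b_one(pooled_totals, balanced_design(3, 10))

total_comparisons = 2 * 3 * 5  # g groups, 3 pairs, 5 comparisons each
treatments_given_agreement = 2 * total_comparisons * math.log(2) - 2 * b_pooled
interaction = 2 * (b_pooled - sum(b_groups))                       # (3.13)
omnibus = 2 * total_comparisons * math.log(2) - 2 * sum(b_groups)
```

By construction the first two statistics add to the third, which is the additivity displayed in Table 5.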

(iv) Tests for specified treatment contrasts, contrasts on the
$\log \pi_i$, may be made by the method of Section 5.

Bradley and Terry (1952a) proposed one additional test. It was assumed
that the treatments fell into two groups, say $T_1, \ldots, T_s$ and
$T_{s+1}, \ldots, T_t$, with $\pi_1 = \cdots = \pi_s = \pi$ and
$\pi_{s+1} = \cdots = \pi_t = (1 - s\pi)/(t - s)$. The test is of
the equality of $\pi$ and $(1 - s\pi)/(t - s)$, or equivalently of $\pi_i = 1/t$,
$i = 1, \ldots, t$, against the two-group alternative of the assumption. The
reader is referred to the reference for details.

3.3.  Confidence  Regions 

Large-sample theory may be used to obtain variances and covariances
for the estimators $p_1, \ldots, p_t$ or their logarithms in paired comparisons.
Bradley (1955) considered this theory with each $n_{ij} = n$ and Davidson and
Bradley (1970), considering the multivariate model discussed in Section 6,
obtained results for general $n_{ij}$ as a special case.

Let $\mu_{ij} = n_{ij}/N$. Then $\sqrt{N}(p_1 - \pi_1), \ldots, \sqrt{N}(p_t - \pi_t)$
have the singular multivariate normal distribution of dimensionality $(t-1)$
in a space of $t$ dimensions with zero mean vector and dispersion matrix
$\Sigma = [\sigma_{ij}]$ such that

$$ \sigma_{ij} = \text{cofactor of } \lambda_{ij} \text{ in } \begin{vmatrix} \Lambda & \mathbf{1}' \\ \mathbf{1} & 0 \end{vmatrix} \Big/ \begin{vmatrix} \Lambda & \mathbf{1}' \\ \mathbf{1} & 0 \end{vmatrix}, \tag{3.14} $$

where $\Lambda = [\lambda_{ij}]$, $\mathbf{1}$ is the $t$-dimensional unit row vector, and

$$ \lambda_{ii} = \frac{1}{\pi_i} \sum_{j \ne i} \mu_{ij} \pi_j / (\pi_i + \pi_j)^2, \quad i = 1, \ldots, t, $$

and

$$ \lambda_{ij} = -\mu_{ij} / (\pi_i + \pi_j)^2, \quad i \ne j, \; i, j = 1, \ldots, t. \tag{3.15} $$

In order to use these results in applications, $\sigma_{ij}$ must be estimated;
this is done through substitution of $p_i$ for $\pi_i$ in (3.15) to obtain the
$\hat{\lambda}_{ij}$ and subsequent substitution in (3.14) yields the
$\hat{\sigma}_{ij}$'s.

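For the data of Table 1, values of $p_1, \ldots, p_4$ in Table 2 are used
to obtain

$$ \hat{\Lambda} = \begin{bmatrix} 10.4963 & -.9558 & -1.2740 & -2.4259 \\ -.9558 & .4304 & -.3022 & -.3553 \\ -1.2740 & -.3022 & .7441 & 0 \\ -2.4259 & -.3553 & 0 & 3.1237 \end{bmatrix}, $$

from whence $\hat{\Sigma} = [\hat{\sigma}_{ij}]$, designated (3.16), follows from
(3.14). Note that $\hat{\Sigma}$ is singular, the row and column sums being zero.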

Approximate confidence regions may be obtained. The confidence
interval on $\pi_i$ is developed from the fact that
$\sqrt{N}(p_i - \pi_i)/\sqrt{\hat{\sigma}_{ii}}$ is standard
normal for large $N$. In the example, the .95-confidence interval for $\pi_1$
is (.0795, .1369). Let $\pi^*$ be a vector containing any subset of $t^*$ distinct
parameters of the set, $t^* < t$. The $(1-\alpha)$-confidence region for these
$t^*$ parameters is that ellipsoidal region of the parameter subspace for which

$$ N (\pi^* - p^*)' \hat{\Sigma}^{*-1} (\pi^* - p^*) \le \chi^2_{\alpha, t^*}. \tag{3.17} $$

In (3.17), $p^*$ is the vector of estimates corresponding to $\pi^*$,
$\hat{\Sigma}^*$ is the dispersion matrix for $\sqrt{N}(p^* - \pi^*)$ obtainable
from (3.16), and $\chi^2_{\alpha, t^*}$ is the $(1-\alpha)$-percentage point of the
central chi-square distribution with $t^*$ degrees of freedom. As an example,
let $\pi^* = (\pi_1, \pi_2)'$. Then $p^* = (.1082, .5193)'$,

$$ \hat{\Sigma}^* = \begin{bmatrix} .0800 & -.0695 \\ -.0695 & .6644 \end{bmatrix} \quad \text{and} \quad \hat{\Sigma}^{*-1} = \begin{bmatrix} 13.7441 & 1.4372 \\ 1.4372 & 1.6553 \end{bmatrix}. $$

With $\alpha = .01$, $t^* = 2$, $\chi^2_{.01, 2} = 9.210$, it may be verified that
(3.17) yields the .99-confidence region,

$$ 13.7441(\pi_1 - .1082)^2 + 1.6553(\pi_2 - .5193)^2 + 2.8744(\pi_1 - .1082)(\pi_2 - .5193) \le .0248. $$
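The passage from $\hat{\Lambda}$ to $\hat{\Sigma}$ in (3.14) amounts to taking the leading $t \times t$ block of the inverse of the bordered matrix, since the ratio of a cofactor to the determinant is an element of the inverse of a symmetric matrix. The Python sketch below illustrates this for the example. It is an illustration only: $\hat{\Lambda}$ is entered as printed in the text rather than recomputed from (3.15), the Gauss-Jordan routine `invert` is a hypothetical helper, and 1.96 is used as the normal percentage point.

```python
import math


def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with pivoting."""
    size = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(size)]
        for i, row in enumerate(matrix)
    ]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[size:] for row in aug]


# Estimated matrix Lambda-hat, entered as printed in the text.
lam = [
    [10.4963, -0.9558, -1.2740, -2.4259],
    [-0.9558, 0.4304, -0.3022, -0.3553],
    [-1.2740, -0.3022, 0.7441, 0.0],
    [-2.4259, -0.3553, 0.0, 3.1237],
]
t = len(lam)

# Bordered matrix of (3.14): Lambda-hat with a row and column of ones.
bordered = [row + [1.0] for row in lam] + [[1.0] * t + [0.0]]
sigma = [row[:t] for row in invert(bordered)[:t]]

# Approximate .95-confidence interval for pi_1, with N = 372, p_1 = .1082.
n_total, p1 = 372, 0.1082
half_width = 1.96 * math.sqrt(sigma[0][0] / n_total)
interval = (p1 - half_width, p1 + half_width)

# Right-hand side of the .99-confidence region: chi-square point over N.
region_bound = 9.210 / n_total
```

The leading 2 x 2 block of `sigma` corresponds to $\hat{\Sigma}^*$ for $(\pi_1, \pi_2)$ in the example.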
Since it may be appropriate to regard $\log \pi_i$ as the location parameter
for $T_i$, $i = 1, \ldots, t$, in view of (2.4) and (2.5), confidence intervals or
regions on the $\log \pi_i$ may be desired. It follows that
$\sqrt{N}(\log p_1 - \log \pi_1), \ldots, \sqrt{N}(\log p_t - \log \pi_t)$ have the
singular multivariate normal distribution with zero mean vector and
dispersion matrix $D \Sigma D$, where $D$ is the diagonal matrix with typical
element $1/\pi_i$. Estimated variances and covariances are as follows:
est. var.$(\sqrt{N} \log p_i) = \hat{\sigma}_{ii}/p_i^2$ and
est. covar.$(\sqrt{N} \log p_i, \sqrt{N} \log p_j) = \hat{\sigma}_{ij}/p_i p_j$,
$i \ne j$. Confidence intervals or regions on the $\log \pi_i$ may be
obtained analogously to those shown above for the $\pi_i$. If a method of multiple
comparisons is to be used, the necessary variances and covariances may be
obtained from the information given.

In the very special case when each $n_{ij} = n$, approximate variances and
covariances may be obtained if the treatments are not too disparate. Then,
on the assumption that $\pi_i = 1/t$, $i = 1, \ldots, t$,
$\sigma_{ii} = 2(t-1)^2/t^3$ and $\sigma_{ij} = -2(t-1)/t^3$, $i \ne j$, while
$N = n\binom{t}{2}$. Like the binomial with its stable variance for its
parameter in a middle range, so are the variances and covariances stable in
paired comparisons when the $\pi_i$ are near $1/t$ and the $n_{ij} = n$. This can
reduce computational effort for balanced experiments.

3.4.  Asymptotic  Relative  Efficiency 

It is well known that the asymptotic relative efficiency of the sign
test to the Student test is $2/\pi$ when assumptions for the latter apply and
appropriate data could be obtained (here $\pi$ denotes the constant
3.14159..., not a treatment parameter). Bradley (1955) showed that, under
similar conditions, the asymptotic relative efficiency of paired comparisons
relative to a randomized complete block design with the same number of
treatment replications is $t/\pi(t-1)$, when each $n_{ij} = n$. This result may
be adjusted to show that the relative efficiency of paired comparisons
relative to the analysis of variance for the similar balanced incomplete
block design is $2/\pi$ by the methods of Raghavarao (1971, Sections 4.3 and 4.5).

While the asymptotic relative efficiency factor of $2/\pi$ suggests loss
of efficiency through use of the ranking or preference designations of
paired comparisons, the method is usually used because measurement scales
are not available for sensory or judgment evaluations.

4. Extensions of the Basic Model
4.1.  Adjustments  for  Ties 

The  basic  paired  comparisons  experiment  forces  decision  on  the  part 
of  the  respondent  and  data  like  those  of  Table  1  result.  Nevertheless, 
ties  or  "non-selection"  judgments  often  arise,  for  example,  in  consumer 
testing. 

The treatment of ties in the sign test has received considerable
attention. Hemelrijk (1952) demonstrated that the most powerful test of
significance was obtained by omission of ties and use of a conditional
binomial test on the sample results so reduced. But the treatment of ties
must depend on experimental objectives, see Gridgeman (1959), and estimation
of potential share of a consumer market surely must require other
considerations. Decisions for paired comparisons must be similar to those for
the sign test. Two formal methods for the treatment of ties in paired
comparisons are available.

Rao and Kupper (1967) introduced a parameter $\theta \ge 1$ and adjusted
probabilities associated with the comparison of $T_i$ and $T_j$ to obtain

$$ P(T_i \to T_j) = \pi_i/(\pi_i + \theta\pi_j) = \frac{1}{4} \int_{-(\log \pi_i - \log \pi_j) + \eta}^{\infty} \mathrm{sech}^2 (y/2) \, dy, $$

$$ P(T_i = T_j) = (\theta^2 - 1)\pi_i\pi_j / (\pi_i + \theta\pi_j)(\theta\pi_i + \pi_j) = \frac{1}{4} \int_{-(\log \pi_i - \log \pi_j) - \eta}^{-(\log \pi_i - \log \pi_j) + \eta} \mathrm{sech}^2 (y/2) \, dy, \quad i \ne j, \tag{4.1} $$

where $\eta = \log \theta$. It is seen that the model extends the linear model of
(2.4) and that $\log \theta$ is, in a sense, a threshold parameter associated with
discriminatory ability.
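The three outcome probabilities of the Rao-Kupper model (4.1) may be tabulated with a short function. The Python sketch below is illustrative only; the function name `rao_kupper` and the parameter values are hypothetical. It shows that the probabilities of preference for $T_i$, of a tie, and of preference for $T_j$ sum to one, and that $\theta = 1$ removes ties and recovers model (2.1).

```python
def rao_kupper(pi_i, pi_j, theta):
    """Return (P(T_i preferred), P(tie), P(T_j preferred)) under (4.1)."""
    prefer_i = pi_i / (pi_i + theta * pi_j)
    prefer_j = pi_j / (theta * pi_i + pi_j)
    tie = (
        (theta ** 2 - 1.0) * pi_i * pi_j
        / ((pi_i + theta * pi_j) * (theta * pi_i + pi_j))
    )
    return prefer_i, tie, prefer_j


# Hypothetical worths and threshold parameter.
with_ties = rao_kupper(0.6, 0.4, theta=1.5)

# theta = 1 corresponds to a zero threshold: no ties, and (2.1) is recovered.
no_ties = rao_kupper(0.6, 0.4, theta=1.0)
```

For the illustrative values, a larger $\theta$ moves probability from both preference outcomes to the tie outcome.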

Rao  and  Kupper  extended  the  theory  in  parallel  with  that  given  above. 

Unfortunately,  they  assumed  that  n^  «  n,  but  the  work  is  easily  extended. 

We  summarize  only  the  results  leading  to  the  test  of  treatment  equality, 

although  they  provide  other  asymptotic  results  including  variances  and  co- 

variances  for  their  estimators.  We  use  our  notation.  Let  N  •  J  n. .  and 

i<j 

b^  be  the  sum  of  the  number  of  ties  and  the  number  of  preferences  for  T\ 

in  the  n..  comparisons  of  T.  and  T..  Lot  b.  *  T  b, .  and  let  brt  be  the 

ij  i  J  l  v  ij  0 

j*i 

total  number  of  ties  in  the  experiment.  The  likelihood  equations  are: 


_i  _  T  'ij 

;  p^fli 


ij 


j  Pi+0Pj 


y  Ju- 

j  QPi+Pj 


0j  1  ®  1|  • • • i  t| 


I  Pi  -  1, 


V..,,  J  ^i£L 

8-1  i*j  Pj+6Pj 


0, 


(4.2) 


where $p_i$ is the estimator of $\pi_i$ and $\hat\theta$ of $\theta$. The likelihood ratio test of
$H_0$: $\pi_i = 1/t$, $i = 1, \dots, t$, versus $H_a$: $\pi_i \ne 1/t$ for some $i$, leads to the
statistic,

$$-2\log\lambda_2 = 2N\log 2N - 2b_0\log 2b_0 - 2(N - b_0)\log(N - b_0) - 2B_1^*, \tag{4.3}$$

where

$$B_1^* = \sum_{i \ne j} b_{ij}\log(p_i + \hat\theta p_j) - \sum_i b_i\log p_i - b_0\log(\hat\theta^2 - 1). \tag{4.4}$$

Again, for large $N$ and under $H_0$, $-2\log\lambda_2$ has the central chi-square distribution
with $(t-1)$ degrees of freedom. An iterative solution of equations
(4.2) is suggested by Rao and Kupper. They provided also a test of the
hypothesis, $\theta = \theta_0$, against the alternative, $\theta \ne \theta_0$.
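
The statistic (4.3) is simple to evaluate once estimates are in hand. The sketch below only evaluates (4.3) and (4.4) at supplied values of $p$ and $\hat\theta$; it does not solve (4.2). The data layout (`wins`, `ties`) and the function name are assumptions made for illustration, and the counts are hypothetical. For perfectly symmetric data the estimates are $p_i = 1/t$ and $\hat\theta = (N + b_0)/(N - b_0)$, and the statistic should then be zero.

```python
import math

def rao_kupper_lr_statistic(wins, ties, p, theta):
    """Evaluate -2 log(lambda_2) of (4.3) at given estimates p and theta.

    wins[i][j]: preferences for T_i over T_j; ties[i][j] == ties[j][i]: ties
    on the pair. Requires theta > 1 and at least one tie.
    """
    t = len(p)
    # b[i][j]: ties plus preferences for T_i in the comparisons of T_i and T_j.
    b = [[wins[i][j] + ties[i][j] if i != j else 0 for j in range(t)]
         for i in range(t)]
    b_row = [sum(row) for row in b]
    pairs = [(i, j) for i in range(t) for j in range(i + 1, t)]
    b0 = sum(ties[i][j] for i, j in pairs)
    n_total = sum(wins[i][j] + wins[j][i] + ties[i][j] for i, j in pairs)
    # B_1^* of (4.4).
    b_star = sum(b[i][j] * math.log(p[i] + theta * p[j])
                 for i in range(t) for j in range(t) if i != j)
    b_star -= sum(b_row[i] * math.log(p[i]) for i in range(t))
    b_star -= b0 * math.log(theta ** 2 - 1)
    return (2 * n_total * math.log(2 * n_total)
            - 2 * b0 * math.log(2 * b0)
            - 2 * (n_total - b0) * math.log(n_total - b0)
            - 2 * b_star)

# Hypothetical symmetric data: 4 preferences each way and 2 ties per pair.
wins = [[0, 4, 4], [4, 0, 4], [4, 4, 0]]
ties = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
stat = rao_kupper_lr_statistic(wins, ties, [1 / 3] * 3, theta=36 / 24)
```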

Davidson (1970) proposed probabilities corresponding to those of (4.1)
as

$$P(T_i \to T_j) = \frac{\pi_i}{\pi_i + \pi_j + \nu\sqrt{\pi_i\pi_j}}$$

and (4.5)

$$P(T_i = T_j) = \frac{\nu\sqrt{\pi_i\pi_j}}{\pi_i + \pi_j + \nu\sqrt{\pi_i\pi_j}},$$

$\nu \ge 0$. This model preserves the odds ratio, $P(T_i \to T_j)/P(T_j \to T_i) = \pi_i/\pi_j$,
consistent with the Luce (1959) choice axiom. In addition, the probability
of a tie is a maximum when $\pi_i = \pi_j$ and diminishes as $\pi_i$ and $\pi_j$ differ, an
intuitively desirable effect.
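
The two properties claimed for the Davidson model can be checked directly. This is a small sketch with an assumed function name and hypothetical parameter values.

```python
import math

def davidson_probs(pi_i, pi_j, nu):
    """Return (P(T_i preferred), P(T_j preferred), P(tie)) under model (4.5)."""
    if nu < 0:
        raise ValueError("nu must be non-negative")
    tie_weight = nu * math.sqrt(pi_i * pi_j)
    denom = pi_i + pi_j + tie_weight
    return pi_i / denom, pi_j / denom, tie_weight / denom

# Hypothetical values: unequal worths, then equal worths, same nu.
unequal = davidson_probs(0.7, 0.3, 0.5)
equal = davidson_probs(0.5, 0.5, 0.5)
odds_ratio = unequal[0] / unequal[1]  # should equal 0.7 / 0.3
```

The tie probability for the equal-worth pair exceeds that for the unequal pair, as the text states.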

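Let $b_{ij}^*$ be the sum of the number of ties and twice the number of
preferences for $T_i$ in the $n_{ij}$ comparisons of $T_i$ and $T_j$ and let $b_i^* = \sum_{j \ne i} b_{ij}^*$.
Davidson's likelihood equations are

$$\frac{b_i^*}{p_i} - \sum_{j \ne i}\frac{n_{ij}\left(2 + \hat\nu\sqrt{p_j/p_i}\right)}{p_i + p_j + \hat\nu\sqrt{p_i p_j}} = 0, \quad i = 1, \dots, t,$$

$$\sum_i p_i = 1, \tag{4.6}$$

$$\frac{b_0}{\hat\nu} - \sum_{i<j}\frac{n_{ij}\sqrt{p_i p_j}}{p_i + p_j + \hat\nu\sqrt{p_i p_j}} = 0,$$

where $p_i$ is the estimator of $\pi_i$ and $\hat\nu$ of $\nu$. The likelihood ratio statistic
corresponding to (4.3) is of the same form with $B_1^*$ replaced by

$$B_1^{**} = \sum_{i<j} n_{ij}\log\left(p_i + p_j + \hat\nu\sqrt{p_i p_j}\right) - \frac{1}{2}\sum_i b_i^*\log p_i - b_0\log\hat\nu. \tag{4.7}$$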


Davidson  also  proposed  an  iterative  solution  for  the  equations  (4.6)  and 
examined  large-sample  theory.  He  showed  that  the  Rao-Kupper  test  and  the 
Davidson  test  for  treatment  equality  are  asymptotically  equivalent. 

The  choice  between  the  two  methods  for  extending  the  basic  paired 
comparisons  model  to  a  model  allowing  for  ties  seems  to  be  a  matter  of 
intuitive  appeal.  Both  give  very  similar  results  in  applications. 


4.2.  Adjustments  for  Order 

In  paired  comparisons,  there  is  often  concern  for  the  effects  of 
order  of  presentation  of  the  two  items  in  a  pair.  Experiments  are  often 
conducted  so  that,  for  each  pair  of  treatments,  each  order  of  presentation 
is  used  equally  frequently  in  an  effort  to  "balance  out"  the  effects  of 
order. Scheffé (1952) addressed this problem in the analysis of variance.
Beaver and Gokhale (1975) extended our basic model to allow for order effects.
Davidson and Beaver in an undated manuscript describe the Beaver-Gokhale
model as having additive order effects and discuss also a model with
multiplicative order effects suggested by Beaver (1976). For the ordered pair
$(T_i, T_j)$, Beaver and Gokhale defined

$$P_{ij}(T_i \to T_j) = \frac{\pi_i + \delta_{ij}}{\pi_i + \pi_j}, \qquad P_{ij}(T_j \to T_i) = \frac{\pi_j - \delta_{ij}}{\pi_i + \pi_j}, \tag{4.8}$$

and, for the ordered pair $(T_j, T_i)$,

$$P_{ji}(T_i \to T_j) = \frac{\pi_i - \delta_{ij}}{\pi_i + \pi_j}, \qquad P_{ji}(T_j \to T_i) = \frac{\pi_j + \delta_{ij}}{\pi_i + \pi_j}. \tag{4.9}$$


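The corresponding probabilities for the model with multiplicative order
effects are

$$P_{ij}(T_i \to T_j) = \frac{\theta_{ij}\pi_i}{\theta_{ij}\pi_i + \pi_j}, \qquad P_{ij}(T_j \to T_i) = \frac{\pi_j}{\theta_{ij}\pi_i + \pi_j},$$

$$P_{ji}(T_i \to T_j) = \frac{\pi_i}{\pi_i + \theta_{ij}\pi_j}, \qquad P_{ji}(T_j \to T_i) = \frac{\theta_{ij}\pi_j}{\pi_i + \theta_{ij}\pi_j}. \tag{4.10}$$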


The model given by (4.8) and (4.9) requires that $|\delta_{ij}| \le \min(\pi_i, \pi_j)$, an
awkward feature, while the model (4.10) only requires that $\theta_{ij} > 0$.
Advantages of the multiplicative model (4.10) are:

(i) Preference probabilities depend on the worth parameters $\pi_i$ and
$\pi_j$ only through the ratio $\pi_i/\pi_j$.

(ii) Model (4.10) admits a sufficient statistic whose dimension is
that of the parameter space.

(iii) Model (4.10) is a linear model and, for example,

$$P_{ij}(T_i \to T_j) = \frac{1}{4}\int_{-(\log\pi_i - \log\pi_j) - \log\theta_{ij}}^{\infty} \operatorname{sech}^2(y/2)\,dy.$$

For these reasons, we limit further discussion to (4.10).

Explicit methodology for model (4.10) and its special cases does not
appear in the statistical literature, although it is implied by Davidson
and Beaver. Various likelihood ratio tests and associated estimation procedures
can be developed easily when needed. We consider only the special
case when $\theta_{ij} = \theta$ for all $i \ne j$. Then the likelihood equations are

$$\frac{a_i}{p_i^0} - \sum_{j \ne i}\frac{n_{ij}\hat\theta}{\hat\theta p_i^0 + p_j^0} - \sum_{j \ne i}\frac{n_{ji}}{p_i^0 + \hat\theta p_j^0} = 0, \quad i = 1, \dots, t,$$

$$\sum_i p_i^0 = 1, \tag{4.11}$$

$$\frac{f}{\hat\theta} - \sum_{i \ne j}\frac{n_{ij}p_i^0}{\hat\theta p_i^0 + p_j^0} = 0,$$

where $f$ is the total number of preferences for the first presented item
of a pair, $p_i^0$ is the estimator of $\pi_i$ and $\hat\theta$ of $\theta$, while $n_{ij}$ is the number
of judgments on the ordered pair $(T_i, T_j)$ and $n_{ji}$ is the number of judgments
on the ordered pair $(T_j, T_i)$. The likelihood ratio statistic for $H_0$: $\pi_i = 1/t$,
$i = 1, \dots, t$, versus $H_a$: $\pi_i \ne 1/t$ for some $i$ in the presence of an order
effect is

$$-2\log\lambda_3 = 2N\log N - 2f\log f - 2(N - f)\log(N - f) - 2B_1^0, \tag{4.12}$$

where

$$B_1^0 = \sum_{i \ne j} n_{ij}\log(\hat\theta p_i^0 + p_j^0) - \sum_i a_i\log p_i^0 - f\log\hat\theta. \tag{4.13}$$

Again, under $H_0$, $-2\log\lambda_3$ has the central chi-square distribution with
$(t-1)$ degrees of freedom. A test for the presence of a common order effect,
$H_0$: $\theta = 1$ versus $H_a$: $\theta \ne 1$, follows immediately. For this test,

$$-2\log\lambda_4 = 2(B_1 - B_1^0) \tag{4.14}$$

has the central chi-square distribution with 1 degree of freedom when $\theta = 1$.
In (4.14), $B_1$ is taken from (3.6).

Other tests could be developed. One of interest is the test for a
common order effect: $H_0$: $\theta_{ij} = \theta$ for all $i \ne j$, $H_a$: $\theta_{ij} \ne \theta$ for some $i$, $j$,
$i \ne j$. Such a test could be described as a test of order by treatment
pair interaction.

Note  that  neither  model  for  order  effects  suggests  that  an  effort  to 
balance  out  the  effects  of  order  is  exactly  right.  Note  also  that  both 
order  effects  and  ties  could  be  important  and  this  is  the  situation  addressed 
by  Davidson  and  Beaver  in  their  unpublished  manuscript. 
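
The remark that balancing the order of presentation is not exactly right can be illustrated under the multiplicative model (4.10). In this sketch the function name and the numerical values are hypothetical; it averages the preference probability for $T_i$ over the two orders and compares it with the probability when there is no order effect.

```python
def order_effect_probs(pi_first, pi_second, theta):
    """Return (P(first item preferred), P(second item preferred)) under (4.10)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    denom = theta * pi_first + pi_second
    return theta * pi_first / denom, pi_second / denom

# Hypothetical worths and a common order effect favoring the first item.
pi_i, pi_j, theta = 0.6, 0.4, 2.0
p_i_when_first, _ = order_effect_probs(pi_i, pi_j, theta)   # ordered pair (T_i, T_j)
_, p_i_when_second = order_effect_probs(pi_j, pi_i, theta)  # ordered pair (T_j, T_i)
balanced = 0.5 * (p_i_when_first + p_i_when_second)
no_order = pi_i / (pi_i + pi_j)
```

For these values the balanced average is close to, but not equal to, the probability without an order effect; the two agree when `theta` is 1.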


4.3.  A  Bayesian  Approach 

Davidson and Solomon (1973) considered a Bayesian approach to the
estimation of the worth parameters $\pi_1, \dots, \pi_t$ of paired comparisons.
They formulated a conjugate prior distribution for the parameters.
Let $a^0 = [a_{ij}^0]$ and $n^0 = [n_{ij}^0]$, $n_{ii}^0 = a_{ii}^0 = 0$, $n_{ij}^0 = n_{ji}^0$. The prior density is

$$\phi(\pi) = A(a^0, n^0)\prod_{i<j}\frac{\pi_i^{a_{ij}^0}\pi_j^{a_{ji}^0}}{(\pi_i + \pi_j)^{n_{ij}^0}} = A(a^0, n^0)\,\frac{\prod_i \pi_i^{a_i^0}}{\prod_{i<j}(\pi_i + \pi_j)^{n_{ij}^0}}, \quad \pi \in \Omega, \tag{4.15}$$

where $a_i^0 = \sum_{j \ne i} a_{ij}^0$ and $\Omega = \{\pi: \pi_i \ge 0,\ i = 1, \dots, t,\ \sum_i \pi_i = 1\}$. They restricted attention
to densities (4.15) for which $a_{ij}^0 \ge 0$ and $a_{ij}^0 + a_{ji}^0 = n_{ij}^0$. They noted that,
even with these restrictions, each $(a^0, n^0)$ determines a distinct prior
distribution and that the family of priors can represent a wide spectrum
of prior beliefs. Davidson and Solomon suggested that the experimenter
think of his prior beliefs in terms of a conceptual experiment with $n_{ij}^0$
responses to the pair $(T_i, T_j)$ with $a_{ij}^0$ of them being preferences for $T_i$.
Choice of $n_{ij}^0$ is to be made as a measure of the strength of the
experimenter's beliefs on the pair $(T_i, T_j)$.

It is noted that the selection of an estimator for the vector of worth
parameters $\pi$ is of central interest. This is to be done on the basis of
the prior distribution (4.15) and the results of experimentation summarized
in the likelihood function conditioned on $\pi$,

$$L(a \mid \pi) = \prod_i \pi_i^{a_i}\prod_{i<j}\binom{n_{ij}}{a_{ij}}(\pi_i + \pi_j)^{-n_{ij}}. \tag{4.16}$$

The estimator of $\pi$ can be used to estimate pairwise preference probabilities
or to provide a ranking of the items or treatments in the experiment.

One estimator of $\pi$ is the mode $p^*$ of the posterior distribution of $\pi$.
This mode is shown to be the solution of the set of equations,

$$\frac{a_i'}{p_i^*} - \sum_{j \ne i}\frac{n_{ij}'}{p_i^* + p_j^*} = 0, \quad i = 1, \dots, t, \qquad \sum_i p_i^* = 1, \tag{4.17}$$

where $n_{ij}' = n_{ij}^0 + n_{ij}$ and $a_i' = a_i^0 + a_i$, $i < j$, $i, j = 1, \dots, t$. It is
seen that the choice of prior distribution led to a natural combination
of prior and experimental information as seen from the definitions of $n_{ij}'$
and $a_i'$. Further, equations (4.17) have the form of equations (3.2) and (3.3).

Davidson and Solomon considered also the Bayes estimator of $\pi$ under
a quadratic loss function, namely p, the mean of the posterior distribution
of $\pi$. While they did not obtain a closed expression for p, they did show
that, if $n_{ij}' = n'$ for all $i < j$, the rankings determined by $p^*$ and p are
identical with the Bayes ranking determined by the posterior score $a'$.
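
Because the posterior-mode equations (4.17) have the same form as the usual likelihood equations, they can be solved by the same kind of fixed-point iteration. The sketch below uses the familiar update $p_i \leftarrow a_i' / \sum_{j \ne i} n_{ij}'/(p_i + p_j)$ followed by renormalization. This scheme, the function name, the data layout and the counts are assumptions for illustration, not a procedure quoted from Davidson and Solomon.

```python
def posterior_mode(a_prior, n_prior, wins, iters=5000, tol=1e-12):
    """Iterate toward the posterior mode p* of (4.17).

    a_prior[i][j], n_prior[i][j]: prior "conceptual experiment" counts;
    wins[i][j]: observed preferences for T_i over T_j.
    """
    t = len(wins)
    n_post = [[0.0] * t for _ in range(t)]
    a_post = [0.0] * t
    for i in range(t):
        for j in range(t):
            if i != j:
                # n'_ij = n0_ij + n_ij and a'_i = a0_i + a_i
                n_post[i][j] = n_prior[i][j] + wins[i][j] + wins[j][i]
                a_post[i] += a_prior[i][j] + wins[i][j]
    p = [1.0 / t] * t
    for _ in range(iters):
        new = [a_post[i] / sum(n_post[i][j] / (p[i] + p[j])
                               for j in range(t) if j != i)
               for i in range(t)]
        total = sum(new)
        new = [x / total for x in new]
        converged = max(abs(x - y) for x, y in zip(new, p)) < tol
        p = new
        if converged:
            break
    return p, a_post, n_post

# Hypothetical data: 10 judgments per pair, prior worth 2 judgments per pair.
wins = [[0, 7, 8], [3, 0, 6], [2, 4, 0]]
a_prior = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
n_prior = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
p_star, a_post, n_post = posterior_mode(a_prior, n_prior, wins)
residuals = [a_post[i] / p_star[i]
             - sum(n_post[i][j] / (p_star[i] + p_star[j]) for j in range(3) if j != i)
             for i in range(3)]
```

Here all $n_{ij}'$ are equal, so the ranking given by `p_star` agrees with the ranking of the posterior scores `a_post`, in line with the result quoted above.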


4.4.  Triple  Comparisons 

The basic model for paired comparisons can be extended to triple comparisons
in at least two ways. Bradley and Terry (1952b) proposed the model,

$$P(T_i \to T_j \to T_k) = \frac{\pi_i\pi_j}{(\pi_i + \pi_j + \pi_k)(\pi_j + \pi_k)}, \tag{4.18}$$

for comparison of $T_i$, $T_j$ and $T_k$ in a triplet, $i \ne j \ne k$, $i, j, k = 1, \dots, t$.

Pendergrass and Bradley (1960) proposed the model,

$$P(T_i \to T_j \to T_k) = \frac{\pi_i^2\pi_j}{\pi_i^2(\pi_j + \pi_k) + \pi_j^2(\pi_i + \pi_k) + \pi_k^2(\pi_i + \pi_j)}. \tag{4.19}$$

In both models, the $\pi$'s may again be regarded as worth parameters with $\sum_i \pi_i = 1$.
Both models have some desirable properties as discussed in the second reference.
Model (4.18) is consistent with the Luce choice axiom and can be written as
a Lehmann model (see Bradley (1976)). Model (4.19) has the property that
the set of treatment rank sums constitutes a set of sufficient statistics
for the estimation of $\pi_1, \dots, \pi_t$. Basic methodology for the second model
is well developed including estimation procedures, tests of hypotheses
including goodness of fit, and asymptotic theory.
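
Both triple-comparison models should assign total probability one to the six possible rankings of a triplet. The following sketch checks this; it takes (4.18) in its Luce-consistent form and (4.19) with the denominator $D_{ijk}$ of (4.21). The function names and worth values are illustrative assumptions.

```python
from itertools import permutations

def luce_triple_prob(pi, i, j, k):
    """P(T_i first, T_j second, T_k third) in the Luce-consistent form assumed for (4.18)."""
    return pi[i] * pi[j] / ((pi[i] + pi[j] + pi[k]) * (pi[j] + pi[k]))

def pendergrass_bradley_prob(pi, i, j, k):
    """P(T_i first, T_j second, T_k third) under model (4.19)."""
    d = (pi[i] ** 2 * (pi[j] + pi[k])
         + pi[j] ** 2 * (pi[i] + pi[k])
         + pi[k] ** 2 * (pi[i] + pi[j]))
    return pi[i] ** 2 * pi[j] / d

worths = [0.5, 0.3, 0.2]  # hypothetical worth parameters
orders = list(permutations(range(3)))
luce_total = sum(luce_triple_prob(worths, *o) for o in orders)
pb_total = sum(pendergrass_bradley_prob(worths, *o) for o in orders)
```

With equal worths both models give each ranking probability 1/6.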

We show only the estimating equations and the basic test for model
(4.19). If $p_1, \dots, p_t$ are the estimators of $\pi_1, \dots, \pi_t$, they result
from solution of the equations,

$$\frac{a_i}{p_i} - \sum_{\substack{j<k \\ j,k \ne i}}\frac{n_{ijk}\left[2p_i(p_j + p_k) + p_j^2 + p_k^2\right]}{D_{ijk}(p)} = 0, \quad i = 1, \dots, t, \tag{4.20}$$

$$\sum_i p_i = 1,$$

where

$$D_{ijk}(p) = p_i^2(p_j + p_k) + p_j^2(p_i + p_k) + p_k^2(p_i + p_j) \tag{4.21}$$

and $n_{ijk}$ is the number of repetitions or rankings on the triplet $(T_i, T_j, T_k)$,
$i < j < k$. The quantity $a_i$ in (4.20) is such that $a_i = 3\sum_{j<k;\ j,k \ne i} n_{ijk} - R_i$,
where $R_i$ is the total sum of ranks for $T_i$ in the experiment. Pendergrass
and Bradley suggest iterative means of solution of the equations (4.20)
although they held each $n_{ijk} = n$ for all $i < j < k$.

The likelihood ratio test of $H_0$: $\pi_i = 1/t$, $i = 1, \dots, t$, versus
$H_a$: $\pi_i \ne 1/t$ for some $i$, is based on

$$-2\log\lambda_5 = 2N\log 6 + 2\sum_i a_i\log p_i - 2\sum_{i<j<k} n_{ijk}\log D_{ijk}(p), \tag{4.22}$$

where $N = \sum_{i<j<k} n_{ijk}$. Under $H_0$, $-2\log\lambda_5$ has the central chi-square
distribution with $(t-1)$ degrees of freedom for large $N$.
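
The quantities $a_i$ and the statistic (4.22) can be computed directly from a list of rankings. The sketch below evaluates (4.22) at supplied estimates rather than solving (4.20). The representation of a ranking as a tuple (first, second, third) and the function names are assumptions for illustration. When every ordering of one triplet occurs equally often, the estimates are $p_i = 1/t$ and the statistic should be zero.

```python
import math
from itertools import permutations

def d_ijk(p, i, j, k):
    """D_ijk(p) of (4.21)."""
    return (p[i] ** 2 * (p[j] + p[k])
            + p[j] ** 2 * (p[i] + p[k])
            + p[k] ** 2 * (p[i] + p[j]))

def triple_lr_statistic(rankings, p):
    """Evaluate -2 log(lambda_5) of (4.22) at given estimates p.

    rankings: list of (first, second, third) treatment indices.
    Returns the statistic and the scores a_i (3 minus rank, summed over rankings).
    """
    t = len(p)
    a = [0] * t
    n = {}
    for first, second, third in rankings:
        a[first] += 2   # rank 1 contributes 3 - 1
        a[second] += 1  # rank 2 contributes 3 - 2
        key = tuple(sorted((first, second, third)))
        n[key] = n.get(key, 0) + 1
    n_total = len(rankings)
    stat = 2 * n_total * math.log(6)
    stat += 2 * sum(a[i] * math.log(p[i]) for i in range(t))
    stat -= 2 * sum(count * math.log(d_ijk(p, *key)) for key, count in n.items())
    return stat, a

# Hypothetical data: each of the six orderings of one triplet observed once.
rankings = list(permutations(range(3)))
stat5, scores = triple_lr_statistic(rankings, [1 / 3] * 3)
```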

Park (1961) applied the Pendergrass-Bradley procedures to experimental
data and compared the results with those from companion experiments using
paired comparisons. He found good model fits and estimator agreement.

5.  Treatment  Contrasts  and  Factorials 

It became apparent very early in applications of paired comparisons
to sensory experimentation that there was need for special analyses when
the treatments represented factorial treatment combinations. Abelson and
Bradley (1954) attempted to address this need with very limited success
and it remained an open problem until solved by Bradley and El-Helbawy
(1976). They considered factorial treatment combinations in the more
general framework of specified treatment contrasts. This simplified both
notation and theory.

In Table 6, we show paired comparisons data for treatments representing
a $2^3$ factorial set of treatment combinations. The data are taken from
Bradley and El-Helbawy (1976) and arise from a consumer preference taste
test on coffees, where the factors are brew strength, roast color and coffee
brand, each at two levels. Twenty-six preference judgments were obtained
on each of the 28 distinct treatment comparisons. Note that it is convenient
to replace the typical treatment $T_i$ by $T_{\alpha_1\alpha_2\alpha_3}$, $\alpha_i = 1$ or 0, $i = 1, 2, 3$,
so that the subscripts indicate the chosen levels of the factors. We shall
return to these data to illustrate use of the general method explained
below with factorials.




Dii > ®* the t-square identity matrix. Note the similar forms in (5.6)
and (3.2). If m = 0, the estimation process involves solution of (3.2)
replaced by (5.3).

Iterative solution of equations (5.4) is discussed briefly by Bradley
and El-Helbawy (1976) and in detail by El-Helbawy and Bradley (1977). In
the latter reference, it is shown that the proposed iterative procedure
converges and yields a maximum of the likelihood function over the parameter
space $\{\pi: \pi_i > 0,\ i = 1, \dots, t,\ \sum_i \log\pi_i = 0,\ B_m\log\pi = 0_m\}$.


A class of likelihood ratio tests may be developed. Let $B_{m_a}$, $B_{m_1}$, and

$$B_{m_0} = \begin{bmatrix} B_{m_a} \\ B_{m_1} \end{bmatrix}$$

be matrices like $B_m$, $0 \le m_a, m_1 \le m_0 \le (t-1)$, $m_0 = m_a + m_1$.
With the condition that $\sum_i \log\pi_i = 0$, we test

$$H_0: B_{m_0}\log\pi = 0_{m_0} \tag{5.8}$$

against

$$H_a: B_{m_a}\log\pi = 0_{m_a}. \tag{5.9}$$

The test statistic is

$$-2\log\lambda_{m_0,m_a} = 2[B_1(p_0) - B_1(p_a)], \tag{5.10}$$

where $B_1$ is defined in (3.6), and, for large $N = \sum_{i<j} n_{ij}$ and under $H_0$ in
(5.8), the statistic has the central chi-square distribution with $m_1$ degrees
of freedom. In (5.10), $p_0$ is the solution of (5.4) where $B_m = B_{m_0}$ and $p_a$,
the solution when $B_m = B_{m_a}$.


Basically, the test involves the assumption that

$$B_{m_a}\log\pi = 0_{m_a},$$

and a test of the additional constraints,

$$B_{m_1}\log\pi = 0_{m_1},$$

$B_{m_1}$ consisting of $m_1$ orthonormal rows orthogonal to those of $B_{m_a}$.

The test procedure is illustrated with the data of Table 6. Treatments
$T_i$ have subscripts in the lexicographic order of $T_\alpha$ in the table. Suppose
that we wish to test the hypothesis that there are no two-factor interactions
on the assumption that there is no three-factor interaction. Then $t = 8$,
$m_a = 1$, $m_1 = 3$, $m_0 = 4$ with

[Matrices $B_{m_a}$ and $B_{m_1}$: entries not recoverable from the source.]

Necessary calculations yield:

$$p_0 = (1.300,\ 1.275,\ 1.060,\ 1.040,\ 0.962,\ 0.944,\ 0.784,\ 0.769),$$

$$p_a = (1.515,\ 1.060,\ 1.342,\ 0.855,\ 0.790,\ 1.193,\ 0.647,\ 0.890),$$

$$B_1(p_0) = 497.81, \qquad B_1(p_a) = 490.14,$$

$$-2\log\lambda_{m_0,m_a} = 2(497.81 - 490.14) = 15.34.$$

The statistic, $-2\log\lambda_{m_0,m_a}$, has the central chi-square distribution with
3 degrees of freedom and is large. It is possible also to partition this
chi-square into three chi-squares, each with 1 degree of freedom, as is
done in Table 7.
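
To quantify "large" for the statistic above, its upper-tail probability under the chi-square distribution with 3 degrees of freedom can be computed in closed form. The sketch uses the standard identity for 3 degrees of freedom, $P(\chi^2_3 > x) = \operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$; the function name is an assumption, and the two $B_1$ values are those quoted in the text.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 degrees of freedom."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

b1_p0, b1_pa = 497.81, 490.14        # values of B_1 reported for the coffee data
statistic = 2.0 * (b1_p0 - b1_pa)    # the statistic (5.10)
p_value = chi2_sf_3df(statistic)
```

The resulting tail probability is well below 0.01, consistent with the statement that the statistic is large.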

The general test procedure for hypothesis (5.8) versus (5.9) based on
the statistic (5.10) may be used repeatedly to produce an analysis of chi-square
table. Two such analyses are given in Tables 7 and 8 for the data
of Table 6. Rows in these tables correspond to rows of the usual analysis
of variance table for a $2^3$ factorial and similar descriptive terms have
been used. In order to preserve orthogonality of the various chi-squares,
they must be sequenced properly; each row requires that certain conditions
be assumed, equivalent to the specification of $B_{m_a}$. Both Tables 7 and 8
are shown to illustrate two different sequencings of the rows and to suggest
that the choice of sequencing does not have substantial effects on the
inferences that may be made. Additional details on computations for Tables 7
and 8 are given by Bradley and El-Helbawy (1976).

The analyses below were done through recognition of the factorial
structure of the treatments. Factorial parameters may be introduced formally,
although it is not necessary to do so. We illustrate with the $2^3$
factorial. Let $\pi_\alpha$ replace $\pi_i$ for the treatment $T_\alpha = T_i$, where
$\alpha = (\alpha_1, \alpha_2, \alpha_3)$, $\alpha_r = 0$ or 1, $r = 1, 2, 3$. We reparameterize by writing

$$\pi_\alpha = \prod_{r=1}^{3}\pi^{(r)}_{\alpha_r}\ \prod_{r<s}\pi^{(rs)}_{\alpha_r\alpha_s}\ \pi^{(123)}_{\alpha_1\alpha_2\alpha_3}. \tag{5.11}$$


The parameters on the right-hand side of (5.11) are new factorial parameters.
The transformation is linear if logarithms are taken; the logarithms of the
new factorial parameters are subject to the usual linear constraints for
factorial parameters in the analysis of variance in order to make the
transformation one-to-one. Estimators of the factorial parameters are functions
of the estimators $p_\alpha$. A full explanation of these procedures is given by
El-Helbawy and Bradley (1976).


Table 7

Analysis of Chi-square for the Coffee Data

[Table body not recoverable from the source.]

Special treatment contrasts may be of interest in paired comparisons.
Suppose that, in a coffee taste test experiment with $t = 4$, $T_4$ represents
an experimental coffee produced by a new process while the other treatments
came from a standard process. One may wish to compare $T_4$ with the other
three treatments. Two approaches are possible. The first assumes nothing,
$m_a = 0$, and takes

$$B_{m_1} = \frac{1}{\sqrt{12}}(1,\ 1,\ 1,\ -3).$$

The second approach assumes that $\pi_1 = \pi_2 = \pi_3$, $m_a = 2$,

$$B_{m_a} = \begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} & 0 & 0 \\ 1/\sqrt{6} & 1/\sqrt{6} & -2/\sqrt{6} & 0 \end{bmatrix},$$

and retains the same $B_{m_1}$. With these matrices defined, the general test
procedure of this section is used.
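
The contrast matrices of this example can be checked for the properties the general procedure requires: each row sums to zero, has unit length, and is orthogonal to the other rows. In the sketch below the helper name is an assumption, and the zero entries of $B_{m_a}$ are assumed (they are needed to give the rows four columns).

```python
import math

def is_orthonormal_contrast_set(rows, tol=1e-12):
    """True if every row sums to zero and the rows are mutually orthonormal."""
    for r, u in enumerate(rows):
        if abs(sum(u)) > tol:
            return False
        for s, v in enumerate(rows):
            target = 1.0 if r == s else 0.0
            if abs(sum(x * y for x, y in zip(u, v)) - target) > tol:
                return False
    return True

s2, s6, s12 = math.sqrt(2), math.sqrt(6), math.sqrt(12)
b_m1 = [[1 / s12, 1 / s12, 1 / s12, -3 / s12]]   # new process versus the rest
b_ma = [[1 / s2, -1 / s2, 0.0, 0.0],             # pi_1 = pi_2
        [1 / s6, 1 / s6, -2 / s6, 0.0]]          # pi_1 * pi_2 = pi_3 ** 2
stacked_ok = is_orthonormal_contrast_set(b_ma + b_m1)
```

The two rows of `b_ma` together express $\pi_1 = \pi_2 = \pi_3$ on the log scale.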

We have presented a method for the examination of specified treatment
contrasts and the analysis of factorial paired comparison experiments
together with examples. These methods provide much new flexibility.


6.  Multivariate  Paired  Comparisons

Multivariate responses to paired comparisons are often obtained. For
example, this happens in consumer testing where, on paired samples, preferences
on a number of characteristics are solicited.

Davidson and Bradley (1969) extended the paired comparisons model to
the multivariate case. Let $s = (s_1, \dots, s_p)$, $s_\alpha = i$ or $j$, be the response
vector on attributes $\alpha = 1, \dots, p$ for the treatment pair $(T_i, T_j)$, $s_\alpha = i$
indicating preference for $T_i$ on attribute $\alpha$. The probability of response
$s$ on $(T_i, T_j)$ is

$$P(s|i,j) = p^{(1)}(s|i,j)\,h(s|i,j), \tag{6.1}$$

where

$$p^{(1)}(s|i,j) = \prod_{\alpha=1}^{p}\pi_{\alpha s_\alpha}/(\pi_{\alpha i} + \pi_{\alpha j}) \tag{6.2}$$

and

$$h(s|i,j) = 1 + \sum_{\alpha<\beta}\delta(s_\alpha, s_\beta)\,\rho_{\alpha\beta}\,(\pi_{\alpha i}/\pi_{\alpha j})^{-\delta(i,s_\alpha)/2}\,(\pi_{\beta i}/\pi_{\beta j})^{-\delta(i,s_\beta)/2}, \tag{6.3}$$

for all $s$, $i < j$, $i, j = 1, \dots, t$. Notation is as follows: $\pi_{\alpha i}$ is the
worth parameter for $T_i$ on attribute $\alpha$, $\sum_i \pi_{\alpha i} = 1$, $\rho_{\alpha\beta}$ is a "correlation"
parameter for attributes $\alpha$ and $\beta$ assumed constant for all treatment pairs,
and $\delta(s_\alpha, s_\beta) = 1$ or $-1$ as the two arguments of the indicator function agree
or disagree. Note that $\rho = 0$ implies independence of responses on attributes;
$\rho$ has typical element $\rho_{\alpha\beta}$. It is necessary to restrict the parameter space
so that $\pi_{\alpha i} \ge 0$, $\alpha = 1, \dots, p$, $i = 1, \dots, t$, and $h(s|i,j) \ge 0$ for each of
the $2^p$ cells associated with each of the $\binom{t}{2}$ treatment pairs.

Let

$$B(\pi) = -\sum_{\alpha=1}^{p} B_1(\pi_\alpha) \tag{6.4}$$

and

$$C(\pi, \rho) = \sum_{i<j}\sum_s f(s|i,j)\log h(s|i,j), \tag{6.5}$$

where $\pi$ has typical element $\pi_{\alpha i}$ and $\pi_\alpha$ is the $\alpha$th row of $\pi$. The quantity
$B_1(\pi_\alpha)$ is the function $B_1$ of (3.6) with $p_i$ there replaced by $\pi_{\alpha i}$ and $a_i$
replaced by $a_{\alpha i}$, the total number of preferences for $T_i$ on attribute $\alpha$.
In addition, $f(s|i,j)$ is the number of times the preference vector $s$ occurs
among the $n_{ij}$ responses to the pair $(T_i, T_j)$. We may express the logarithm
of the likelihood function as

$$\log L = C(\pi, \rho) + B(\pi). \tag{6.6}$$


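Consider first a test for independence: $H_0$: $\rho = 0$ versus $H_a$: $\rho_{\alpha\beta} \ne 0$
for some $\alpha < \beta$, $\alpha, \beta = 1, \dots, p$. Under $H_0$, the likelihood equations reduce
to equations (3.2) and (3.3) for each $\alpha = 1, \dots, p$. If $p_\alpha^0$ is the solution
for the $\alpha$th set of equations and becomes the $\alpha$th row of $p^0$, $p^0$ estimates $\pi$
under $H_0$. Under $H_a$, the equations to be solved are:

$$\sum_{i<j}\sum_s f(s|i,j)\,h^{-1}(s|i,j)\,\delta(s_\alpha, s_\beta)\,(\pi_{\alpha i}/\pi_{\alpha j})^{-\delta(i,s_\alpha)/2}\,(\pi_{\beta i}/\pi_{\beta j})^{-\delta(i,s_\beta)/2}\,\Big|_{\pi = p,\ \rho = \hat\rho} = 0,$$

$$\alpha < \beta,\ \alpha, \beta = 1, \dots, p,$$

$$\frac{a_{\alpha i} + R_{\alpha i}}{p_{\alpha i}} - \sum_{j \ne i}\frac{n_{ij}}{p_{\alpha i} + p_{\alpha j}} = 0, \quad i = 1, \dots, t,\ \alpha = 1, \dots, p, \tag{6.7}$$

$$\sum_i p_{\alpha i} = 1, \quad \alpha = 1, \dots, p,$$

where

$$R_{\alpha i} = -\frac{1}{2}\sum_{j \ne i}\sum_s f(s|i,j)\,h^{-1}(s|i,j)\,\delta(i, s_\alpha)\,(p_{\alpha i}/p_{\alpha j})^{-\delta(i,s_\alpha)/2}\sum_{\beta \ne \alpha}\delta(s_\alpha, s_\beta)\,\hat\rho_{\alpha\beta}\,(p_{\beta i}/p_{\beta j})^{-\delta(i,s_\beta)/2}. \tag{6.8}$$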


Solution of equations (6.7) is discussed by Davidson and Bradley (1969).
If we let $p$ and $\hat\rho$ be the estimators of $\pi$ and $\rho$ from equations (6.7), the
likelihood ratio test statistic is

$$-2\log\lambda_6 = 2\{B(p) - B(p^0) + C(p, \hat\rho)\} \tag{6.9}$$

and, under $H_0$, it has the central chi-square distribution with $\frac{1}{2}p(p-1)$
degrees of freedom.

If it is assumed that $\rho = 0$, tests on the parameters $\pi_{\alpha i}$ may be made
separately as in the univariate case for each $\alpha = 1, \dots, p$.

An overall test of no treatment preferences may be made in the presence
of correlations. Then we have $H_0$: $\pi = [1/t]$ and $H_a$: $\pi_{\alpha i} \ne 1/t$ for some $\alpha$
and $i$. Under $H_a$, the estimators from equations (6.7) are again $p$ and $\hat\rho$.
Under $H_0$, the estimators of $\pi$ and $\rho$ are $[1/t]$ and $\hat\rho_0$, the latter obtained
from solution of (6.7) with $p = [1/t]$. The test statistic is

$$-2\log\lambda_7 = 2\{B(p) + C(p, \hat\rho) + pN\log 2 - C([1/t], \hat\rho_0)\} \tag{6.10}$$

with the central chi-square distribution with $p(t-1)$ degrees of freedom
under $H_0$.

A likelihood ratio test of the fit of the model (6.1) is given by
Davidson and Bradley. An alternative test may be based on

$$X^2 = \sum_{i<j}\sum_s \{f(s|i,j) - \hat f(s|i,j)\}^2/\hat f(s|i,j) \tag{6.11}$$

and, under the model, has the central chi-square distribution for large $N$
with $\{(2^p - 1)\binom{t}{2} - p(t-1) - \binom{p}{2}\}$ degrees of freedom. The estimators $p$ and $\hat\rho$
are substituted in (6.1) to obtain expected cell frequencies
$\hat f(s|i,j) = n_{ij}\hat P(s|i,j)$.
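
The degrees of freedom attached to the three tests of this section depend only on the number of attributes $p$ and the number of treatments $t$. A small sketch, with an assumed function name, collects them; for $p = 3$ and $t = 3$ it gives 3, 6 and 12.

```python
from math import comb

def multivariate_test_df(p, t):
    """Degrees of freedom for the tests based on (6.9), (6.10) and (6.11)."""
    return {
        "independence": comb(p, 2),            # one rho per pair of attributes
        "equal_preferences": p * (t - 1),      # t - 1 free worths per attribute
        "model_fit": (2 ** p - 1) * comb(t, 2) - p * (t - 1) - comb(p, 2),
    }

df_example = multivariate_test_df(p=3, t=3)
```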

Davidson  and  Bradley  (1970)  examine  large-sample  properties  of  procedures 
discussed  above.  Davidson  and  Bradley  (1971)  examine  regression  relationships 
among  the  characteristics  in  the  multivariate  problem. 

We conclude this section with one of the examples given by Davidson
and Bradley (1969). Table 9 shows the observed and expected cell frequencies,
the latter in parentheses, for a chocolate pudding test with $t = 3$, $p = 3$,
the treatments being brands, and the attributes being taste, color and
texture.


Table 9

Observed and Expected Cell Frequencies
for a Chocolate Pudding Test

Treatment            Cell Frequencies f(s|i,j), Cells s                      Frequency
Pair i, j   (iii)   (jii)   (iji)   (jji)   (iij)   (jij)   (ijj)   (jjj)      n_ij

  1, 2        8       1       1       1       0       2       0       9        22
           (7.93)  (1.09)  (1.15)  (1.69)  (0.76)  (0.97)  (0.37)  (8.03)

  1, 3        6       0       1       1       1       0       1       9        19
           (6.25)  (0.60)  (1.24)  (0.92)  (1.12)  (0.62)  (0.64)  (7.61)

  2, 3        7       1       1       1       3       1       1       6        21
           (6.92)  (0.37)  (1.26)  (0.60)  (1.70)  (0.75)  (1.10)  (8.31)

Details on calculations are not given. However, as a possible check
on computer programming, the solution of (6.7) is as follows:

$$p = \begin{bmatrix} 0.312 & 0.360 & 0.328 \\ 0.307 & 0.321 & 0.372 \\ 0.338 & 0.288 & 0.374 \end{bmatrix}, \qquad \hat\rho_{12} = 0.675, \quad \hat\rho_{13} = 0.654, \quad \hat\rho_{23} = 0.588.$$

Tests are summarized in Table 10. It is seen that the major effects are
the high correlations among responses on attributes.


Table 10

Test Statistics for Hypotheses
for the Chocolate Pudding Data

Test                         Statistic       Ref. No.    Value    d.f.

Test of Independence         -2 log λ_6      (6.9)       62.665     3
Test of Equal Preferences    -2 log λ_7      (6.10)       2.362     6
Test of Model Fit            X^2             (6.11)       7.557    12

As  a  final  comment  on  the  example,  cell  frequencies  are  small  and 
asymptotic  theory  must  be  regarded  only  as  approximate.  The  tests  do, 
however,  seem  to  work  well  and  be  adequately  indicative. 
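
The reported solution of (6.7) can be used, as suggested, to check a program against Table 9. The sketch below codes the cell probability (6.1) with (6.2) and (6.3) and computes expected frequencies for the pair (1, 2). Function names and the zero-based indexing of treatments and attributes are assumptions; the numerical inputs are the estimates and $n_{12} = 22$ quoted above. Because the estimates are rounded to three decimals, agreement with the tabulated values is only to about two decimals.

```python
from itertools import product

def delta(u, v):
    """Indicator taking the value 1 when the arguments agree and -1 otherwise."""
    return 1 if u == v else -1

def cell_probability(s, i, j, worth, rho):
    """P(s|i,j) of (6.1); worth[a][i] is the worth of T_i on attribute a,
    rho[(a, b)] with a < b is the "correlation" parameter for attributes a, b."""
    p1 = 1.0
    for a in range(len(worth)):
        p1 *= worth[a][s[a]] / (worth[a][i] + worth[a][j])      # factor of (6.2)
    h = 1.0
    for a in range(len(worth)):
        for b in range(a + 1, len(worth)):
            h += (delta(s[a], s[b]) * rho[(a, b)]
                  * (worth[a][i] / worth[a][j]) ** (-delta(i, s[a]) / 2)
                  * (worth[b][i] / worth[b][j]) ** (-delta(i, s[b]) / 2))
    return p1 * h

worth = [[0.312, 0.360, 0.328],
         [0.307, 0.321, 0.372],
         [0.338, 0.288, 0.374]]
rho = {(0, 1): 0.675, (0, 2): 0.654, (1, 2): 0.588}
i, j, n_ij = 0, 1, 22   # treatment pair (1, 2) in the numbering of Table 9
expected = {s: n_ij * cell_probability(s, i, j, worth, rho)
            for s in product((i, j), repeat=3)}
```

Here `expected[(0, 0, 0)]` corresponds to cell (iii), `expected[(1, 0, 0)]` to cell (jii), and `expected[(1, 1, 1)]` to cell (jjj) in the first row of Table 9.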

7.  Other  Methods  of  Paired  Comparisons 

Our  efforts  in  this  chapter  have  concentrated  on  one  method  of  paired 
comparisons  and  its  extensions.  This  was  done  because  it  has  been  most 
fully  developed  and  has  been  found  to  work  well  in  applications.  Even  so, 
it  has  been  necessary  to  be  brief  and  applications  require  computer  programs 
that are easily developed after review of pertinent references for additional
detail.


We  have  seen  that  the  Thurstone  model  is  very  similar  to  the  one  used 
here.  It  has  had  less  attention.  However,  three  papers  do  extend  the 
Thurstone  model:  Harris  (1957)  generalized  the  model  to  allow  for  possible 
order  effects,  Glenn  and  David  (1960)  allowed  for  ties,  and  Sadasivan  (1982) 
permitted  unequal  numbers  of  judgments  on  pairs. 

Other approaches to the analysis of paired comparisons exist. Kendall
and Babington Smith (1940) considered the count of circular triads as a
measure of consistency of judgments and also developed a coefficient of
concordance as a measure of agreement of judgments by several judges.
Guttman (1946) developed a method of scaling treatments in paired comparisons,
the objective of Zermelo. Saaty (1977) proposed a consensus method through
evaluation by group discussion to provide treatment or item scores on a
ratio scale. Bliss, Greenwood and White (1956) used "rankits" in the analysis
of paired comparisons. Mehra (1964) and Puri and Sen (1969) extended the
idea of signed ranks to paired comparisons. Wei (1952) and Kendall (1955)
have proposed an iterative scoring system that takes into account not only
direct comparisons but also roundabout comparisons involving other items.

No attention has been given here to the design of tournaments. There
is an extensive literature on this subject included in the Davidson-Farquhar
bibliography.